Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX
Internal identifier: 000421 (Main/Exploration); previous: 000420; next: 000422
Authors: Lorena Etcheverry [Uruguay]; Shahan Khatchadourian [Canada, United States]; Mariano Consens [Canada, United States]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2010.
Abstract
The functional genomics and informatics community has made extensive microarray experimental data available online, facilitating independent evaluation of experiment conclusions and enabling researchers to access and reuse a growing body of gene expression knowledge. While there are several data-exchange standards, numerous microarray experiment datasets are published using the MAGE-ML XML schema. Assessing the quality of published experiments is a challenging task, and there is no consensus among microarray users on a framework to measure dataset quality. In this paper, we develop techniques based on DescribeX (a summary-based visualization tool for XML) that quantitatively and qualitatively analyze MAGE-ML public collections, gaining insights about schema usage. We address specific questions such as detection of common instance patterns and coverage, precision of the experiment descriptions, and usage of controlled vocabularies. Our case study shows that DescribeX is a useful tool for the evaluation of microarray experiment data quality that enhances the understanding of the instance-level structure of MAGE-ML datasets.
Url: https://api.istex.fr/document/A54568780D1B1D763EFA6EF758DC50E82472A1C5/fulltext/pdf
DOI: 10.1007/978-3-642-15120-0_15
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX</title>
<author><name sortKey="Etcheverry, Lorena" sort="Etcheverry, Lorena" uniqKey="Etcheverry L" first="Lorena" last="Etcheverry">Lorena Etcheverry</name>
</author>
<author><name sortKey="Khatchadourian, Shahan" sort="Khatchadourian, Shahan" uniqKey="Khatchadourian S" first="Shahan" last="Khatchadourian">Shahan Khatchadourian</name>
</author>
<author><name sortKey="Consens, Mariano" sort="Consens, Mariano" uniqKey="Consens M" first="Mariano" last="Consens">Mariano Consens</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:A54568780D1B1D763EFA6EF758DC50E82472A1C5</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15120-0_15</idno>
<idno type="url">https://api.istex.fr/document/A54568780D1B1D763EFA6EF758DC50E82472A1C5/fulltext/pdf</idno>
<idno type="wicri:Area/Main/Corpus">001516</idno>
<idno type="wicri:Area/Main/Curation">001291</idno>
<idno type="wicri:Area/Main/Exploration">000421</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX</title>
<author><name sortKey="Etcheverry, Lorena" sort="Etcheverry, Lorena" uniqKey="Etcheverry L" first="Lorena" last="Etcheverry">Lorena Etcheverry</name>
<affiliation><wicri:noCountry code="subField">Universidad de la República</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Uruguay</country>
</affiliation>
</author>
<author><name sortKey="Khatchadourian, Shahan" sort="Khatchadourian, Shahan" uniqKey="Khatchadourian S" first="Shahan" last="Khatchadourian">Shahan Khatchadourian</name>
<affiliation wicri:level="4"><country>Canada</country>
<placeName><settlement type="city">Toronto</settlement>
<region type="state">Ontario</region>
</placeName>
<orgName type="university">Université de Toronto</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Consens, Mariano" sort="Consens, Mariano" uniqKey="Consens M" first="Mariano" last="Consens">Mariano Consens</name>
<affiliation wicri:level="4"><country>Canada</country>
<placeName><settlement type="city">Toronto</settlement>
<region type="state">Ontario</region>
</placeName>
<orgName type="university">Université de Toronto</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">A54568780D1B1D763EFA6EF758DC50E82472A1C5</idno>
<idno type="DOI">10.1007/978-3-642-15120-0_15</idno>
<idno type="ChapterID">Chap15</idno>
<idno type="ChapterID">15</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: The functional genomics and informatics community has made extensive microarray experimental data available online, facilitating independent evaluation of experiment conclusions and enabling researchers to access and reuse a growing body of gene expression knowledge. While there are several data-exchange standards, numerous microarray experiment datasets are published using the MAGE-ML XML schema. Assessing the quality of published experiments is a challenging task, and there is no consensus among microarray users on a framework to measure dataset quality. In this paper, we develop techniques based on DescribeX (a summary-based visualization tool for XML) that quantitatively and qualitatively analyze MAGE-ML public collections, gaining insights about schema usage. We address specific questions such as detection of common instance patterns and coverage, precision of the experiment descriptions, and usage of controlled vocabularies. Our case study shows that DescribeX is a useful tool for the evaluation of microarray experiment data quality that enhances the understanding of the instance-level structure of MAGE-ML datasets.</div>
</front>
</TEI>
<affiliations><list><country><li>Canada</li>
<li>Uruguay</li>
<li>États-Unis</li>
</country>
<region><li>Ontario</li>
</region>
<settlement><li>Toronto</li>
</settlement>
<orgName><li>Université de Toronto</li>
</orgName>
</list>
<tree><country name="Uruguay"><noRegion><name sortKey="Etcheverry, Lorena" sort="Etcheverry, Lorena" uniqKey="Etcheverry L" first="Lorena" last="Etcheverry">Lorena Etcheverry</name>
</noRegion>
</country>
<country name="Canada"><region name="Ontario"><name sortKey="Khatchadourian, Shahan" sort="Khatchadourian, Shahan" uniqKey="Khatchadourian S" first="Shahan" last="Khatchadourian">Shahan Khatchadourian</name>
</region>
<name sortKey="Consens, Mariano" sort="Consens, Mariano" uniqKey="Consens M" first="Mariano" last="Consens">Mariano Consens</name>
</country>
<country name="États-Unis"><noRegion><name sortKey="Khatchadourian, Shahan" sort="Khatchadourian, Shahan" uniqKey="Khatchadourian S" first="Shahan" last="Khatchadourian">Shahan Khatchadourian</name>
</noRegion>
<name sortKey="Consens, Mariano" sort="Consens, Mariano" uniqKey="Consens M" first="Mariano" last="Consens">Mariano Consens</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/ParkinsonV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000421 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000421 | SxmlIndent | more
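Once the record has been extracted with one of the commands above, its affiliation tree can be read with any XML parser. The following is a minimal Python sketch, not part of Dilib: the function name `authors_by_country` and the embedded `AFFILIATIONS_XML` excerpt (the `<tree>` part of the record's `<affiliations>` block, with sort attributes trimmed) are illustrative assumptions. Note that, as printed on this page, the full record uses a `wicri:` prefix with no visible namespace declaration, so a namespace-aware parser may reject it unless that prefix is bound first; the excerpt below avoids the issue because it contains no prefixed names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical excerpt of the <affiliations><tree> block from the record,
# with the sortKey/uniqKey attributes trimmed for brevity.
AFFILIATIONS_XML = """
<affiliations><tree>
<country name="Uruguay"><noRegion><name>Lorena Etcheverry</name></noRegion></country>
<country name="Canada"><region name="Ontario"><name>Shahan Khatchadourian</name></region>
<name>Mariano Consens</name></country>
<country name="États-Unis"><noRegion><name>Shahan Khatchadourian</name></noRegion>
<name>Mariano Consens</name></country>
</tree></affiliations>
"""


def authors_by_country(xml_text):
    """Map each named <country> element to the author names nested under it."""
    root = ET.fromstring(xml_text)
    result = defaultdict(list)
    for country in root.iter("country"):
        country_name = country.get("name")
        if country_name is None:
            # The record also uses <country> without a name attribute
            # as a plain list container; skip those.
            continue
        # iter() walks all descendants, so names under <region> or
        # <noRegion> are collected along with direct children.
        for name in country.iter("name"):
            result[country_name].append(name.text.strip())
    return dict(result)


if __name__ == "__main__":
    for country, authors in authors_by_country(AFFILIATIONS_XML).items():
        print(country, "->", ", ".join(authors))
```

Country names are kept exactly as they appear in the record's `name` attributes, since other tools may key on those values.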
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Sante |area= ParkinsonV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:A54568780D1B1D763EFA6EF758DC50E82472A1C5 |texte= Quality Assessment of MAGE-ML Genomic Datasets Using DescribeX }}
This area was generated with Dilib version V0.6.23.